/*

Replication code for "Fuel Subsidies Limit Democratization: 
Evidence from a Global Sample, 1990-2014"

Forthcoming at International Studies Quarterly

Author: Matthew D. Fails
fails@oakland.edu

This code replicates all analyses reported in the paper and supplemental material

The analysis requires the installation of two Stata packages: 'coefplot' and 'ivreg2h'
Both can be installed by typing 'ssc install [package name]'

This code makes reference to two different replication datasets. 

	1) The first is the main replication dataset, and is used for lines 31 through 214. 
	2) The second is a separate dataset in multiple imputation format. This corresponds
	to the Supplemental Material, Part B, section III (model 16 specifically). 

This replication dataset includes all countries in the world from 1990 - 2014. However, the analysis 
limits the sample to country-years where an increase in the level of electoral democracy is possible. 
This eliminates all countries that already score the maximum 1.0 on the polyarchy index (ordinal), as well
as countries lacking domestic autonomy. The variable 'Vsamp' is a binary variable distinguishing 
eligible country years (1) from non-eligible country years. All replication code limits the analysis
to the appropriate sample with the "if Vsamp==1" code.
  
*/

clear
version 14 
set more off 

use "c:\data\Subsidies\ISQRepData.dta", replace

*Description of all variables/variable labels
des

*create Figure 1

twoway (scatter meansubs meanoil if year==2012, mlabel(country) ylabel(0(2)6) xtitle("Average Oil Income Per Person (log), 1990-2014") ytitle("Average Fuel Subsidies Per Person in USD (log)" "1990-2014") scheme(s1mono) graphregion(fcolor(white)) legend(order( 2 "Est. Sample Linear Fit" 3 "Est. Sample Linear Smoothed")) ///
	title("Oil Income and Fuel Subsidies"))(lfit meansubs meanoil if year==2012 & Vsamp==1, clpattern(longdash) clwidth(thick))(lowess meansubs meanoil if year==2012 & Vsamp==1, clpattern(shortdash) clwidth(thick))
	

**Table 1
*Model 1
probit poly5dv l.oilincome l.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)

*Model 2
probit poly5dv l.oilincome l.subUSD l.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)

*Model 3
probit poly5dv oilcm l.oildev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	

*Model 4
probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	

*Create Figure 2
*Marginal effects for main variable (subUSDcm in model 4)
sum subUSDcm if e(sample)
centile subUSDcm if e(sample), centile(10 25 50 75 90)
*for full range of main variable
margins, at(subUSDcm=(0(.5)5.88)) atmeans level(90)
marginsplot, recast(line) recastci(rarea) scheme(s1mono) graphregion(fcolor(white)) xtitle("Fuel Subsidies Country Mean") ytitle("Probability that DV = 1") title("Substantive Effect of Fuel Subsidies (country means), full range") subtitle("All other variables held at mean; 90% CI's reported")
*for 10th - 90th percentile of main variable
margins, at(subUSDcm=(0(.25)2.4)) atmeans level(90)
marginsplot, recast(line) recastci(rarea) scheme(s1mono) graphregion(fcolor(white)) xtitle("Fuel Subsidies Country Mean") ytitle("Probability that DV = 1") title("Substantive Effect of Fuel Subsidies (country means), 10th - 90th percentile") subtitle("All other variables held at mean; 90% CI's reported")

*estimate predicted probability of transition with all variables in model set to mean values
margins, atmeans level(90)

*estimate margins for some comparisons in text
centile WBgrow_new if e(sample), centile(10 25 50 75 90)
margins, at(WBgrow_new=(.28 4.9)) atmeans level(90)
margins, at(WBgrow_new=(-2.6 7.6)) atmeans level(90)

*Create Figure 3
*re-run model 4 from Table 1 (model must be re-estimate before each margins command)
probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	
tabstat WBgrow_new subUSDcm oilcm lincome_new pastpoly5 if e(sample), stat(mean sd q iqr) 

margins, at(subUSDcm=(0.03 1.5)) atmeans level(90) contrast(at effects) post
estimates store margins_1

probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	
margins, at(WBgrow_new=(.28 4.9)) atmeans level(90) contrast(at effects) post
estimates store margins_2

probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	
margins, at(oilcm=(0 4.1)) atmeans level(90) contrast(at effects) post
estimates store margins_3

probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	
margins, at(L.lincome_new=(7.5 9.1)) atmeans level(90) contrast(at effects) post
estimates store margins_4

probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	
margins, at(pastpoly5=(1 3)) atmeans level(90) contrast(at effects) post
estimates store margins_5

*generate Figure 3 - note: must have 'coefplot' installed
coefplot ///
(margins_1, label(Fuel Subsidies, country means)) ///
(margins_2, label(Economic Growth)) ///
(margins_3, label(Oil Income)) ///
(margins_4, label(Income Per Capita)) ///
(margins_5, label(Sum of Past Transitions)), xline(0) level(90) xtitle(Change in Predicted Probability per IQR increase) ///
title("Estimates of Parameters of Interest from Model 4") ytitle("Explanatory Variables") ///
ylabel(.) note("Note: Estimates are expressed as 90% confidence intervals of the change in the predicted probability" "of a democratic transition per interquartile range increase in each explanatory variable. Predicted" "probabilities calculated while holding all other explanatory variables at mean values.")


*Create Table 2 (Lewbel IV approach)
*must have 'ivregh2' installed

*generate year dummies ('ivreg2h' does not handle factor variables)
tabulate year if year>=1990, generate(g)

*Model 5
ivreg2h poly5dv l.oilincome (l.subUSD = ) l.lincome_new WBgrow_new L.poly5 pastpoly5 g1-g23 aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)

*Model 6
ivreg2h poly5dv oilcm l.oildev (subUSDcm  =  ) l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 aclp_region g1-g23 vtim vtim2 vtim3 if Vsamp==1, cluster(ccode) first ffirst


***Supplemental File Materials
*Table 1A - Main Results Robustness
*re-estimate model without subsidies, but using the subsidy sample
probit poly5dv l.oilincome l.subUSD l.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)
gen insamp =1 if e(sample)
*Model 1 - Same as Model 1, but with Model 2 Sample
probit poly5dv l.oilincome l.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1 & insamp==1, cluster(ccode)

*Model 2 - Same as Model 2, but excluding oil income
probit poly5dv l.subUSD l.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)

*Model 3 - Same as Model 4, but excluding oil income
probit poly5dv subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	


*Table 2A - Linear Probability Model Version of Table 1
*Model 4
reg poly5dv l.oilincome l.subUSD l.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)
*Model 5
reg poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	

*Table 3A - Probit results, using 3 point change in Polity2 as marker of DV = 1
*Model 6
probit poldv l.oilincome l.subUSD l.lincome_new WBgrow_new L.polity2 pasttrans i.year aclp_region ptim ptim2 ptim3 if Vsamp==1, cluster(ccode)
*Model 7
probit poldv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.polity2 pasttrans year aclp_region ptim ptim2 ptim3 if Vsamp==1, cluster(ccode)	

*Table 4A - Robustness to using Price Gap measure of subsidies
*note, price gap truncates negative values to 0 following same logic of main subsidy measure construction (i.e. negative values represent taxes on consumpion)
*Model 8
probit poly5dv l.oilincome l.pdiff0 l.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)

*Model 9
probit poly5dv oilcm l.oildev pdiff0cm l.pdiff0dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	

*Model 10
ivreg2h poly5dv oilcm l.oildev (pdiff0cm  =  ) l.pdiff0dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 aclp_region g1-g23 vtim vtim2 vtim3 if Vsamp==1, cluster(ccode) first ffirst


*Table 5A - Tests of Heteroskedasticity in Table 2 IV Esimates
by ccode: gen lagsub = l.subUSD
ivreg2h poly5dv l.oilincome (lagsub = ) l.lincome_new WBgrow_new L.poly5 pastpoly5 g1-g23 aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode) gen(new,replace)
ivreg2 poly5dv l.oilincome (lagsub = l.lincome_new WBgrow_new L.poly5 pastpoly5 g1-g23 aclp_region vtim vtim2 vtim3 newlagsub_L_oilincome_g newlagsub_L_lincome_new_g newlagsub_WBgrow_new_g newlagsub_L_poly5_g newlagsub_pastpoly5_g newlagsub_g1_g newlagsub_g2_g newlagsub_g3_g newlagsub_g4_g newlagsub_g5_g newlagsub_g6_g newlagsub_g7_g newlagsub_g8_g newlagsub_g9_g newlagsub_g10_g newlagsub_g11_g newlagsub_g12_g newlagsub_g13_g newlagsub_g14_g newlagsub_g15_g newlagsub_g16_g newlagsub_g17_g newlagsub_g18_g newlagsub_g19_g newlagsub_g20_g newlagsub_g21_g newlagsub_g22_g newlagsub_g23_g newlagsub_aclp_region_g newlagsub_vtim_g newlagsub_vtim2_g newlagsub_vtim3_g)  if Vsamp==1, robust 
ivhettest 

ivreg2h poly5dv oilcm l.oildev (subUSDcm  = ) l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 aclp_region g1-g23 vtim vtim2 vtim3 if Vsamp==1, cluster(ccode) gen(new,replace)
ivreg2 poly5dv (subUSDcm = oilcm l.oildev l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 aclp_region g1-g23 vtim vtim2 vtim3 newsubUSDcm_oilcm_g newsubUSDcm_L_oildev_g newsubUSDcm_L_subUSD_dev_g newsubUSDcm_meanpoly5dv_g newsubUSDcm_L_lincome_new_g newsubUSDcm_WBgrow_new_g newsubUSDcm_L_poly5_g newsubUSDcm_pastpoly5_g newsubUSDcm_aclp_region_g newsubUSDcm_g1_g newsubUSDcm_g2_g newsubUSDcm_g3_g newsubUSDcm_g4_g newsubUSDcm_g5_g newsubUSDcm_g6_g newsubUSDcm_g7_g newsubUSDcm_g8_g newsubUSDcm_g9_g newsubUSDcm_g10_g newsubUSDcm_g11_g newsubUSDcm_g12_g newsubUSDcm_g13_g newsubUSDcm_g14_g newsubUSDcm_g15_g newsubUSDcm_g16_g newsubUSDcm_g17_g newsubUSDcm_g18_g newsubUSDcm_g19_g newsubUSDcm_g20_g newsubUSDcm_g21_g newsubUSDcm_g22_g newsubUSDcm_g23_g newsubUSDcm_vtim_g newsubUSDcm_vtim2_g newsubUSDcm_vtim3_g), robust
ivhettest

*Table 6A - Controlling for Total Social Expenditures, Oil Price, and Foreign Reserves
*Model 11
probit poly5dv l.oilincome l.subUSD l.totwelshare l.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)

*Model 12
probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev l.totwelshare meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	

*Model 13
probit poly5dv subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 l.oil_price_2000 l.reserves_mths i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)


*Table 7A - Excluding backsliding observations
*First identify backsliding observations in Model 4
tab poly5ch
gen back = 0
replace back = 1 if poly5ch<0 & poly5ch!=.
tab back 

*Model 14
probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1 & back!=1, cluster(ccode)	

*Model 15
ivreg2h poly5dv oilcm l.oildev (subUSDcm  =  ) l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 aclp_region g1-g23 vtim vtim2 vtim3 if Vsamp==1 & back!=1, cluster(ccode) first ffirst

*Table 8A - Descriptive Statistics
probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	
gen dessamp=1 if e(sample)
sum poly5dv oilcm oildev subUSDcm subUSD_dev meanpoly5 lincome_new WBgrow_new poly5 pastpoly5 aclp_region  poldv vtim vtim2 vtim3 pdiff0 pdiff0cm pdiff0dev if dessamp==1


***Items from supplemental file, part B

*Model 17 - limit to 2000 - 2010, where the degree of country interpolation is minimized
probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1 & year>=2000 & year<=2010, cluster(ccode)	

*Model 18 - limit to 2002-2009, which all comes from one source (IMF)
probit poly5dv oilcm l.oildev subUSDcm l.subUSD_dev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1 & year>=2002 & year<=2009, cluster(ccode)	

*Model 19 - uses alternative calculation of subsidies per capita with MI methods to fill in missing values of domestic gas prices
probit poly5dv oilcm l.oildev subUSDcm3 subUSDdev3 meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3 if Vsamp==1, cluster(ccode)	


***Use dataset that uses multiple imputation to fill in missing values of all variables
*Note the call to a new dataset

clear
use "C:\data\Subsidies\MI_ISQRepData.dta"

mi estimate: probit poly5dv oilcm l.oildev subUSDcm subUSDdev meanpoly5 L.lincome_new WBgrow_new L.poly5 pastpoly5 i.year aclp_region vtim vtim2 vtim3, cluster(ccode)

